Feature weighting in DBSCAN using reverse nearest neighbours
Abstract
DBSCAN is arguably the most popular density-based clustering algorithm, and it is capable of recovering non-spherical clusters. One of its main weaknesses is that it treats all features equally. In this paper, we propose a density-based clustering algorithm capable of calculating feature weights representing the degree of relevance of each feature, which takes the density structure of the data into account. First, we improve DBSCAN and introduce a new algorithm called DBSCANR. DBSCANR reduces the number of parameters of DBSCAN to one. Then, a new step is introduced to the clustering process of DBSCANR to iteratively update the feature weights based on the current partition of the data. The feature weights produced by the weighted version of the new clustering algorithm, W-DBSCANR, measure the relevance of variables in a clustering and can be used in feature selection in data mining applications where large and complex real-world data are often involved. Experimental results on both artificial and real-world data have shown that the new algorithms outperformed various DBSCAN-type algorithms in recovering clusters in data.
Similar resources
Feature Reduction and Nearest Neighbours
Feature reduction is a major preprocessing step in the analysis of high-dimensional data, particularly from biomolecular high-throughput technologies. Reduction techniques are expected to preserve the relevant characteristics of the data, such as neighbourhood relations. We investigate the neighbourhood preservation properties of feature reduction empirically and theoretically. Our results indic...
Instance and Feature Weighted k-Nearest-Neighbours Algorithm
We present a novel method that aims at providing a more stable selection of feature subsets when variations in the training process occur. This is accomplished by using an instance-weighting process (assigning different importances to instances) as a preprocessing step to a feature weighting method that is independent of the learner, and then making good use of both sets of computed weights in ...
Nearest Neighbours Search Using the PM-Tree
We introduce a method of searching the k nearest neighbours (k-NN) using the PM-tree. The PM-tree is a metric access method for similarity search in large multimedia databases. As an extension of the M-tree, the structure of the PM-tree exploits local dynamic pivots (as the M-tree does) as well as global static pivots (used by LAESA-like methods). While in M-tree a metric region is represented by a hyper-...
The Utility of Feature Weighting in Nearest-Neighbor Algorithms
Nearest-neighbor algorithms are known to depend heavily on their distance metric. In this paper, we investigate the use of a weighted Euclidean metric in which the weight for each feature comes from a small set of options. We describe Diet, an algorithm that directs search through a space of discrete weights using cross-validation error as its evaluation function. Although a large set of possib...
Journal
Journal title: Pattern Recognition
Year: 2023
ISSN: 1873-5142, 0031-3203
DOI: https://doi.org/10.1016/j.patcog.2023.109314